Abstract: This paper analyses distributed evolutionary
computation based on the Representational State Transfer
(REST) architectural style, which is used to overlay a
farming model on evolutionary computation. An approach to
the distributed evolutionary optimisation of multilayer
perceptrons (MLP) using REST and the Perl language is
presented. In the experiments, a master-slave evolutionary
algorithm (EA) has been implemented, in which slave
processes evaluate the costly fitness function (training an
MLP to solve a classification problem). The results show
that the parallel version of the program obtains similar or
better classification results in much less time than the
sequential version, achieving a good speedup.

I. Introduction

Service Oriented Architecture (SOA) [1] is a paradigm for
organizing and utilizing distributed computational resources,
called services. Using this paradigm, the service providers
publish the descriptions (or interfaces) of the services they
offer in a service registry, so that the service requesters can
discover them and bind to the corresponding service provider.
Web Services are the key point of integration for different
applications belonging to different platforms, languages and
systems, since they are based on a set of standards that make
them independent of the underlying technologies used to
provide them.

Although there are several technologies for developing
web services (SOAP, REST or XML-RPC among others
[2], [3]), nowadays the main approaches are SOAP (Simple
Object Access Protocol) [4], [5] and REST (Representational
State Transfer) [6].

SOAP is the traditional, standards-based approach, but the
majority of web services with a public API offer REST
interfaces, while some of them offer both REST and SOAP
and very few offer only SOAP.

Many major Web Services providers use REST: Twitter,
Yahoo, Flickr, del.icio.us, pubsub, bloglines, technorati,
and several others. Both eBay and Amazon offer Web
Services for both REST and SOAP.

On the other hand, SOAP Web Services are also used in much
enterprise software; for example, Google implements
its Web Services using SOAP, with the exception of
Blogger, which uses XML-RPC, an early and simpler
precursor of SOAP.

The philosophies of SOAP and RESTful Web Services are
very different. Strictly, SOAP is a protocol for distributed
computing, whereas REST adheres much more closely to a
web-based design. SOAP requires a greater implementation
and understanding effort on the client side, in contrast to
REST-based APIs, which concentrate that effort on the server
side.

It is important to note that one of the advantages of SOAP
is the use of a "generic" transport. While REST today uses
HTTP/HTTPS, SOAP can use almost any transport to send
the request. However, one perceived disadvantage of SOAP is
its use of XML, because of its verbosity and the time needed
to parse it.

This work continues our previous research on
service-oriented algorithms, as previously stated in [7], where
a service-oriented platform was presented, or [8], where
studies on P2P distributed evolutionary algorithms were
performed.

In this paper we propose using REST for distributed
computation, demonstrating how it can be used for
evolutionary computation. Our aim is to implement a
distributed evolutionary algorithm (EA) using Perl and REST
to solve a costly problem: tuning the learning parameters and
setting the initial weights and hidden layer size of a
multilayer perceptron (MLP), using an EA and Quick
Propagation [9] (QP) to solve classification problems. This
paper continues the research on evolutionary optimisation of
MLPs (the G-Prop method) presented in [10], [11]. This method
leverages the capabilities of two classes of algorithms: the
ability of the EA to find a solution close to the global
optimum, and the ability of the back-propagation algorithm
(BP) to tune a solution and reach the nearest local minimum
by means of local search from the solution found by the EA.
Instead of using a pre-established topology, the population
is initialised with different hidden layer sizes, and specific
operators are designed to change them (mutation, multi-point
crossover, addition and elimination of hidden units, and QP
training applied as an operator). The EA searches for and
optimises the architecture (number of hidden units), the
initial weight setting for that architecture and the learning
rate for that net.

The main idea of this paper, which is basically a proof
of concept, is to explore the possibilities of this setup
as a meta-computer by implementing an EA on it and
then measuring the speedup when several computers are
used at the same time. The problem we attempt to
solve is a costly classification problem, so that it takes
long enough to benefit from parallelization. We
only try to measure how running time scales when new
(heterogeneous) nodes are added to the system, the
main objective being to test whether this kind of system is
suitable for scientific computation.

The rest of this paper is structured as follows: Section II
presents a description of REST technology. The main
paradigms of parallel and distributed evolutionary
computation are reviewed in Section III. Then, Section IV
describes the proposed method, based on a farming model.
Section V details the experimental setup and presents the
results obtained. Finally, brief conclusions and future work
are presented in Section VI.

II. REST: Representational State Transfer 

After some years, Internet architects have found an
alternative method for building web services in the form of
Representational State Transfer (REST) [6].

REST is a style of software architecture for distributed
hypermedia systems such as the World Wide Web. The term
Representational State Transfer was introduced and defined
in 2000 by Roy Fielding in his doctoral dissertation [12],
[13]. Fielding is one of the principal authors of the Hypertext
Transfer Protocol (HTTP) specification versions 1.0 and 1.1
[14], [15].

REST-style architectures consist of clients and servers. 
Clients initiate requests to servers; servers process requests 
and return appropriate responses. Requests and responses 
are built around the transfer of representations of resources. 
A resource can be essentially any coherent and meaningful 
concept that may be addressed. 

Although REST was initially described in the context of
HTTP, it is not limited to that protocol. RESTful architectures
can be based on other Application Layer protocols if they al- 
ready provide a rich and uniform vocabulary for applications 
based on the transfer of meaningful representational state. 
RESTful applications maximize the use of the pre-existing, 
well-defined interface and other built-in capabilities provided 
by the chosen network protocol, and minimize the addition 
of new application-specific features on top of it. 

In a REST environment, clients are not concerned with 
data storage, which remains internal to each server, so that 
the portability of client code is improved. Servers are not 
concerned with the user interface or user state, so that 
servers can be simpler and more scalable. Servers and clients 
may also be replaced and developed independently, as long 
as the interface is not altered. Finally, servers are able to 
temporarily extend or customize the functionality of a client 
by transferring logic to it that it can execute. 

The client-server communication is further constrained by 
no client context being stored on the server between requests. 
Each request from a client contains all of the information 
necessary to serve the request, and any session state is held 
in the client. The server can be stateful; this constraint merely 
requires that server-side state be addressable by URL as 
a resource. This not only makes servers more visible for 
monitoring, but also makes them more reliable in the face of 
partial or network failures as well as further enhancing their 
scalability. 

The main features of REST web services are:

• Simple and lightweight (not a lot of extra XML markup) 

• Human readable format 

• Easy to build (no toolkits required) 

• High performance 



III. Parallel and Distributed Evolutionary 
Algorithms 

We concentrate on parallel and distributed evolutionary
computation, which has already been adapted to several
paradigms of parallel and distributed computing (for
example, Jini [16], JavaSpaces [17], Java with applets [18],
MPI [19], service oriented architectures [20] and P2P [21]).

There are many ways to implement a distributed EA, one
of which is the island model (migration): the population is
divided into small subpopulations of the same size, assigned
to different processors. From time to time each processor
selects the best individuals in its subpopulation and sends
them to its neighbouring processors, receiving in turn copies
of the best individuals of its neighbours (migration of
individuals). All processors then replace the worst
individuals of their populations. This kind of algorithm is
also known as a distributed EA (Tanese [22], Pettey et al.
[23], Cantu-Paz and Goldberg [24]).
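
The migration step described above can be illustrated with a
short sketch. It is written in Python rather than the Perl used
in the rest of this work, and everything in it is an assumption
made for illustration: the ring topology, the one-max toy
fitness, the mutation rate and the function names are not taken
from any of the cited implementations.

```python
import random

def one_max(bits):
    """Toy fitness (assumption): number of ones in the bit string."""
    return sum(bits)

def evolve(pop, rng, fitness):
    """One steady-state step: mutate a tournament winner, replace the worst."""
    a, b = rng.sample(pop, 2)
    parent = a if fitness(a) >= fitness(b) else b
    child = [bit ^ (rng.random() < 0.05) for bit in parent]
    worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))
    pop[worst] = child

def migrate(islands, fitness, n_migrants=1):
    """Ring migration: each island sends copies of its best individuals
    to the next island, where they replace the worst ones."""
    # Pick all emigrants first, so every island sends pre-migration bests.
    best = [sorted(pop, key=fitness, reverse=True)[:n_migrants]
            for pop in islands]
    for i, pop in enumerate(islands):
        incoming = best[(i - 1) % len(islands)]
        pop.sort(key=fitness)  # worst individuals first
        pop[:len(incoming)] = [list(ind) for ind in incoming]

def run_islands(n_islands=4, pop_size=10, length=20,
                steps=200, every=20, seed=1):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(length)]
                for _ in range(pop_size)]
               for _ in range(n_islands)]
    for step in range(1, steps + 1):
        for pop in islands:  # in a real system each island runs on its own node
            evolve(pop, rng, one_max)
        if step % every == 0:
            migrate(islands, one_max)
    return islands
```

Here the islands are evolved one after another in a single
process; in a distributed setting only the call to `migrate`
would require communication between processors.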

Another alternative is global parallelization (farming)
[25], [26], [27], in which individual evaluation and/or
genetic operator application are parallelized. The global
model does not divide the population. Instead, it exploits
the inherent parallelism of evolutionary algorithms (a
population of individuals). The calculations that need the
whole population (fitness assignment and selection) are
performed by the master, and all remaining calculations,
which are performed on one or two individuals, can be
distributed among a number of slaves. The slaves can
perform recombination, mutation and the evaluation of the
objective function separately (these calculations can be done
in parallel). This is known as a synchronous master-slave
structure. A nearly linear speedup of the calculation time may
be achieved (as long as the evaluation time of the objective
function is higher than the communication time between
master and slaves). The global model is a simple way
(inherent to every evolutionary algorithm) to reduce very long
computation times.
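
A minimal sketch of a synchronous master-slave generation is
given below, in Python rather than Perl. It is only an
illustration under stated assumptions: a local thread pool
stands in for the remote slaves, the sphere function stands in
for a costly fitness function, and only the evaluation is
farmed out. All names are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def costly_fitness(individual):
    """Stand-in for an expensive evaluation (assumption): sphere
    function, to be minimised."""
    return sum(x * x for x in individual)

def master_slave_generation(population, pool, rng, replace=0.3):
    # Slaves: each individual is evaluated independently, in parallel.
    scores = list(pool.map(costly_fitness, population))
    # Master: selection needs the whole population, so it stays here.
    ranked = [ind for _, ind in
              sorted(zip(scores, population), key=lambda pair: pair[0])]
    n_new = max(1, int(len(ranked) * replace))
    survivors = ranked[:len(ranked) - n_new]
    # Replace the worst individuals with mutated copies of survivors.
    children = [[x + rng.gauss(0, 0.1) for x in rng.choice(survivors)]
                for _ in range(n_new)]
    return survivors + children, min(scores)

def run(generations=20, pop_size=20, dim=5, workers=4, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(dim)]
                  for _ in range(pop_size)]
    history = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(generations):
            population, best = master_slave_generation(population, pool, rng)
            history.append(best)
    return history
```

The master waits for all evaluations before selecting, which is
what makes the scheme synchronous: the slowest slave determines
the duration of each generation.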

Although many approaches to distributed EAs [28] can
be found in the literature, in this paper we do not intend to
innovate in that sense, but in the implementation (because
implementation matters [29]).

IV. Master-Slave based EA implementation using
REST and Perl

An ideal client-server implementation of a distributed EA
could be a server process with several threads. Each thread
would hold a population, and would communicate with the
other threads through the code shared among them. Each
thread would use its own queue of individuals to send to
other threads. Each thread would evaluate its individuals on
different remote computers, carrying out the communication
through a REST server.

However, as we cannot use a threaded version of the
Perl modules, our implementation focuses on the most
time-consuming operation in G-Prop: the fitness function
evaluation. The whole evolutionary algorithm is run on the
master and only the objective function is sent to the slaves
for evaluation (as shown in Figure 1).

Fig. 1

Schema of the master-slave based GA implemented in the
second experiment. The master process runs the GA and the
slave processes evaluate the fitness function.

Server:

    use Dancer;

    my $src = "";

    get '/' => sub {
        return "Hello World";
    };

    get '/uploadcode/:code' => sub {
        $src = params->{code};
        return "ok";
    };

    get '/downloadcode/' => sub {
        return $src;
    };

    Dancer->dance;

Client:

    use LWP;

    $c = new LWP::UserAgent;
    $c->agent("RESTzilla");

    $r = new HTTP::Request GET =>
        'http://127.0.0.1:3000/downloadcode/';
    $u = $c->request($r);

    # shows the fitness function received
    print $u->content;

    # evaluate the received function
    eval( $u->content );

Fig. 2

REST programming example: server (first listing) and client
(second listing). In this example the REST server deploys
three services, while the client first obtains the fitness
function as Perl source code by calling the corresponding
service, and then evaluates that function.

The whole system can be sketched as follows:

1) The EA process sends the fitness function code to the
REST server and creates the EA population.

2) Some clients connect to the REST server and load the
fitness function, sent as Perl code, from the server
(Figure 2 shows an example of server and client processes
implemented to upload and download the fitness
function source code).

3) The EA process sends non-evaluated individuals to the
server.

4) The clients ask the server for individuals in order to
evaluate them.

5) The clients evaluate the individuals and send the results
back to the server.

6) The EA process obtains the evaluated individuals from the
server and continues the evolutionary loop.

7) The EA terminates after a fixed number of generations
(it sends a termination message through the REST
server to the clients that remain ready to attend new
workloads).
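
The seven steps above can be sketched as a small in-process
model of the exchange, written in Python rather than Perl. This
is not the code used in this work: the `Broker` class is a
hypothetical stand-in for the REST server with the HTTP layer
left out, and the method names, the individual identifiers and
the function name `fitness` expected in the uploaded code are
all assumptions.

```python
from collections import deque

class Broker:
    """In-process stand-in for the REST server: it only stores data
    and hands it out on request."""
    def __init__(self):
        self.code = None
        self.pending = deque()
        self.results = {}
        self.finished = False

    def upload_code(self, source):          # step 1 (EA process)
        self.code = source

    def download_code(self):                # step 2 (client)
        return self.code

    def put_individuals(self, batch):       # step 3: batch maps id -> individual
        self.pending.extend(batch.items())

    def get_individual(self):               # step 4 (client)
        return self.pending.popleft() if self.pending else None

    def put_result(self, ind_id, value):    # step 5 (client)
        self.results[ind_id] = value

    def collect_results(self):              # step 6 (EA process)
        done, self.results = self.results, {}
        return done

    def terminate(self):                    # step 7 (EA process)
        self.finished = True

def load_fitness(source):
    """Turn downloaded source into a callable; like the eval in Fig. 2,
    this must only be used with trusted code."""
    namespace = {}
    exec(source, namespace)
    return namespace["fitness"]

def client_step(broker, fitness):
    """One polling round of an evaluator; False when there is no work."""
    task = broker.get_individual()
    if task is None:
        return False
    ind_id, individual = task
    broker.put_result(ind_id, fitness(individual))
    return True

def demo():
    broker = Broker()
    broker.upload_code("def fitness(ind):\n    return sum(ind)\n")
    fitness = load_fitness(broker.download_code())
    broker.put_individuals({0: [1, 2], 1: [3, 4]})
    while client_step(broker, fitness):
        pass
    results = broker.collect_results()
    broker.terminate()
    return results
```

Because the clients pull work whenever they are idle, faster
machines naturally evaluate more individuals, which is how the
server balances the load among heterogeneous clients.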

The server in these experiments is mainly used for schedul- 
ing and balancing the tasks among the different clients; 
the network itself is used for communication, but all the 
interchange of information among clients must be cleared by 
the central server. However, one of the objectives of the work 
presented in this paper has been to create an infrastructure 
that would get rid of the bottleneck represented by the central 
server in these experiments. 

Implementation was carried out using the Dancer
module [30], [31] for the Perl programming language, chosen
for its stability and the authors' familiarity with this
language [32], [33], [29]. In addition, servers are easy to
implement and deploy.

As an example, Figure 2 shows the source code of a
REST server that deploys three services, and a client that
obtains the fitness function (as Perl source code) by calling
the corresponding service (and then evaluates that function).

The evolutionary algorithm has been implemented
using the Algorithm::Evolutionary (A::E) library [32].
Version 0.76.2 is used in this work, available at
http://opeal.sourceforge.net under the GPL license.

The full source code (servers, GA and evaluators) and
experiment data are available under the GPL at:
http://atc.ugr.es/pedro/GProp-REST.tgz

In this work, we adapt G-Prop as a distributed EA using
REST, following the structure detailed above. The G-Prop
method has been fully described and analysed in previous
papers (see [10], [11]), so we refer to these papers for
further details. In most cases, evolved MLPs must be coded
into chromosomes to be handled by the genetic operators;
G-Prop, however, uses no binary codification. Instead, the
initial parameters of the network are evolved using specific
variation operators such as mutation, multi-point crossover,
addition and elimination of hidden units, and QP training
applied as an operator to the individuals of the population.
The EA optimises the classification ability of the MLP, and
at the same time it searches for the number of hidden units
(architecture), the initial weight setting and the learning
rate for that net.
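
To make the non-binary representation more concrete, the
following Python sketch shows how an individual and the
addition and elimination operators could look. It is not the
G-Prop code (see [10], [11] for the actual operators): the
dictionary layout, the weight and learning-rate ranges and the
function names are assumptions, and only the limits of 2 and 90
hidden units come from Table I.

```python
import copy
import random

def random_unit(n_in, n_out, rng):
    """A hidden unit with its incoming (plus bias) and outgoing weights."""
    return {"w_in": [rng.uniform(-0.05, 0.05) for _ in range(n_in + 1)],
            "w_out": [rng.uniform(-0.05, 0.05) for _ in range(n_out)]}

def random_mlp(n_in, n_out, rng, min_hidden=2, max_hidden=90):
    """An individual: a learning rate plus a variable-size hidden layer."""
    n_hidden = rng.randint(min_hidden, max_hidden)
    return {"learning_rate": rng.uniform(0.01, 0.5),
            "hidden": [random_unit(n_in, n_out, rng)
                       for _ in range(n_hidden)]}

def add_hidden_unit(mlp, rng, max_hidden=90):
    """Return a copy with one extra, randomly initialised hidden unit."""
    child = copy.deepcopy(mlp)
    if len(child["hidden"]) < max_hidden:
        n_in = len(child["hidden"][0]["w_in"]) - 1
        n_out = len(child["hidden"][0]["w_out"])
        child["hidden"].append(random_unit(n_in, n_out, rng))
    return child

def remove_hidden_unit(mlp, rng, min_hidden=2):
    """Return a copy with one randomly chosen hidden unit removed."""
    child = copy.deepcopy(mlp)
    if len(child["hidden"]) > min_hidden:
        child["hidden"].pop(rng.randrange(len(child["hidden"])))
    return child
```

Both operators return a modified copy and leave the parent
untouched, so the parent can remain in the population while the
offspring is sent out for evaluation.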

Only "default" parameters have been used (genetic
operators were applied using the same application rate). No
parameter tuning has been done, since we do not intend
to find the optimal ones, but to prove the feasibility of the
implementation.

V. Experimental setup and results 

The tests used to assess the accuracy of a method must
be chosen carefully, because some of them (toy problems)
are not suitable for certain capacities of the BP algorithm,
such as generalization [35]. Our opinion, like that of
Prechelt [36], is that real-world problems should be used to
test an algorithm.

A. The "Glass" Classification Problem 

This problem consists of the classification of glass types,
and is also taken from [36]. The results of a chemical analysis
of glass splinters (percent content of 8 different elements)
plus the refractive index are used to classify the sample
as float-processed or non-float-processed building
windows, vehicle windows, containers, tableware, or head
lamps. This task is motivated by forensic needs in criminal
investigation. This dataset was created from the glass
problem dataset in the UCI repository of machine learning
databases. The data set contains 214 instances. Each sample
has 9 attributes plus the class attribute: refractive index,
sodium, magnesium, aluminium, silicon, potassium, calcium,
barium, iron, and the class attribute (type of glass).

The main data set was divided into three disjoint parts, 
for training, validating and testing. In order to obtain the 
fitness of an individual, the MLP (in the slave processes) is 
trained with the training set and its fitness is established from 
the classification error with the validating set. Once the EA 
(in the master process) is finished, when it reaches the limit 
of generations, the classification error with the testing set is 
calculated: this is the result shown in tables. 
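
The evaluation protocol described above can be illustrated
with a short Python sketch. The 50/25/25 proportions, the
shuffling and the function names are assumptions made for the
example; the paper only states that the three parts are
disjoint.

```python
import random

def split_three(samples, fractions=(0.5, 0.25, 0.25), seed=0):
    """Shuffle and cut into disjoint training, validation and test parts.
    The proportions are an assumption, not taken from the paper."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_valid = int(len(shuffled) * fractions[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

def error_rate(classify, labelled):
    """Percentage of (features, label) pairs that `classify` gets wrong."""
    wrong = sum(1 for features, label in labelled
                if classify(features) != label)
    return 100.0 * wrong / len(labelled)
```

In this scheme a slave would train the MLP on the first part
and return `error_rate` on the second as the fitness, while the
third part is only touched once, after the last generation, to
produce the error reported in the tables.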

Up to 4 computers have been used to run the algorithm and
to obtain results with both the sequential and parallel
versions of the program. The server process was run on an
Ubuntu/Linux machine, while the clients were run on a
Windows 7 machine with the Cygwin environment
(http://www.cygwin.com) and on Ubuntu/Linux machines.
Computer speeds range from 1.5 GHz to 2 GHz, and the
computers are connected through the Ethernet network
of the university (with a high communication latency, i.e. an
average ping of 7 ms). No experiments on a homogeneous
computer network have been carried out, because our aim is
to demonstrate the potential of distributed EAs using web
services.

As stated before, the EA was executed using the "default"
parameter values (shown in Table I).



TABLE I

List of parameters used to execute the EA.

Parameter                       Value
------------------------------  --------------------
number of generations           100
individuals in the population   100
% of the population replaced    30%
number of hidden units          ranging from 2 to 90
epochs to calculate fitness     300

B. Obtained Results 

Time was measured using the "gettimeofday" function in
order to achieve good precision. The time taken to run the EA
is reported in Table II. The sequential version of the program
was run on the fastest machine; in parallel runs, the EA
(master process) was run on the fastest machine while the
evaluators were run on slower machines. In this experiment we
are not interested in comparing results against other authors,
but in using a costly problem that justifies a farming model.

The results obtained are shown in Table II.

TABLE II

Results (error % and time) obtained using both the sequential
and the parallel versions (up to 4 evaluator slaves are used in
the farming model). Comparable classification ability is
obtained, while time improves as the number of evaluators is
increased.

Model                   Error (%)   Time (seconds)
----------------------  ----------  --------------
Sequential              33 ± 2      1215 ± 104
Master-slave, 1 eval.   33 ± 3      1308 ± 114
Master-slave, 2 eval.   32 ± 3      719 ± 96
Master-slave, 3 eval.   32 ± 2      522 ± 87
Master-slave, 4 eval.   32 ± 3      424 ± 92

Classification errors show a comparable algorithmic result. 
However, better results in time are obtained parallelizing the 
problem between several computers. 

Figure 3 shows that the speedup does not equal the number
of computers used; however, simulation time is improved
by using several computers. Thus, as adding new evaluators
(heterogeneous computers running a Perl process) is an easy
and costless task, we could take advantage of this system
structure to solve costly optimization problems. Moreover,
results could be better if a dedicated communication network
were used; the university Ethernet network is overloaded,
which implies a high latency in communications
between processes.
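
As a worked example, the speedup and parallel efficiency can
be recomputed from the mean times of Table II. The Python
sketch below assumes that speedup is measured against the
sequential version (rather than against the master-slave run
with one evaluator) and ignores the reported deviations; the
function name is hypothetical.

```python
# Mean running times in seconds, copied from Table II.
SEQUENTIAL = 1215
MASTER_SLAVE = {1: 1308, 2: 719, 3: 522, 4: 424}

def speedup_table(t_seq, t_par):
    """Rows of (evaluators, speedup, efficiency), rounded to 2 decimals."""
    rows = []
    for n, t in sorted(t_par.items()):
        s = t_seq / t                  # speedup relative to the sequential run
        rows.append((n, round(s, 2), round(s / n, 2)))
    return rows

for n, s, e in speedup_table(SEQUENTIAL, MASTER_SLAVE):
    print(f"{n} evaluator(s): speedup {s:.2f}, efficiency {e:.2f}")
```

Under these assumptions the speedup with four evaluators is
about 2.87 (efficiency 0.72), while a single remote evaluator
is slightly slower than the sequential run (speedup 0.93),
which reflects the communication overhead.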

VI. Conclusions and Work in Progress 

This paper presents a new parallel-distributed computation
implementation using REST and web services that shows how
useful this technology can be in the field of evolutionary
computation.

Fig. 3

Plot of the speedup (dashed line) and the f(x) = x function
(solid line). Although the speedup is not linear, simulation
time improves as the number of client computers dedicated to
evaluating the individual fitness functions increases.

To implement and use communications based on REST it
is not necessary to run virtual machines (as in Java
programming) or daemons, but only to install a few libraries
that are available for almost any programming language.
Moreover, an arbitrary number of computers
(clients-evaluators) can be added to the system, making it
more efficient.

In these experiments, we have demonstrated that REST can 
be used as communication protocol for distributed evolution- 
ary computation, obtaining a good speedup. Results could 
improve using a dedicated communication network instead 
of the overloaded network of the university. 

REST provides a common interface that can be called from
almost any programming language. Thus, programs can be
written in any language and can share data without
worrying about message formats or communication
protocols.

At the same time, it does not overload the network much.
Using other distributed systems, such as Jini [37], [38],
the network traffic is so high that when a large number
of computers are used, communication becomes difficult.

A future in which different remote computers offer ser- 
vices to the scientific community can be imagined: for 
example, all the services available at the moment by means 
of HTML forms could be implemented easily as services. 

As future research, it is important to add support
for REST to existing distributed EA libraries in order to
allow the implementation of multi-language EAs. Another
possibility is to test P2P architectures, where each
computer communicates only with one or two computers in
the network. It would be very interesting to parallelize the
proposed method using random topologies, in such a way that
a "servent" (server/client) can enter or leave the network at
any moment.